Shuffle an array [EPI]¶
Time: O(N); Space: O(N); medium
Shuffle a set of numbers without duplicates.
Example 1:
Init an array with set 1, 2, and 3. nums = [1, 2, 3] solution = Solution(nums)
Shuffle the array [1,2,3] and return its result. Any permutation of [1,2,3] must equally likely to be returned. solution.shuffle()
Resets the array back to its original configuration [1,2,3]. solution.reset()
Returns the random shuffling of array [1,2,3]. solution.shuffle()
Initial Thoughts¶
Normally I would display more than two approaches, but shuffling is deceptively easy to do almost properly, and the Fisher-Yates algorithm is both the canonical solution and asymptotically optimal.
A few notes on randomness are necessary before beginning - both approaches displayed below assume that the languages’ pseudorandom number generators (PRNGs) are sufficiently random. The sample code uses the simplest techniques available for getting pseudorandom numbers, but for each possible permutation of the array to be truly equally likely, more care must be taken. For example, an array of length nn has n! distinct permutations. Therefore, in order to encode all permutations in an integer space, [lg(n!)⌉ bits are necessary, which may not be guaranteed by the default PRNG.
1. Brute Force [O(N^2), O(N)]¶
Intuition If we put each number in a “hat” and draw them out at random, the order in which we draw them will define a random ordering.
Algorithm The brute force algorithm essentially puts each number in the aforementioned “hat”, and draws them at random (without replacement) until there are none left. Mechanically, this is performed by copying the contents of array into a second auxiliary array named aux before overwriting each element of array with a randomly selected one from aux. After selecting each random element, it is removed from aux to prevent duplicate draws. The implementation of reset is simple, as we just store the original state of nums on construction.
The correctness of the algorithm follows from the fact that an element (without loss of generality) is equally likely to be selected during all iterations of the for loop. To prove this, observe that the probability of a particular element e being chosen on the kkth iteration (indexed from 0) is simply P(e being chosen during the kth iteration) x P(e not being chosen before the kth iteration). Given that the array to be shuffled has nn elements, this probability is more concretely stated as the following: (1 /(n-k)) * ∏ (n-i)/(n - i + 1), i = (1,k) When expanded (and rearranged), it looks like this (for sufficiently large K):
((n-1)/n x (n-2)/(n-1) x … x (n-K+1)/(n-k+2) * (n-k)/(n-k+1)) * 1/(n-k)
For the base case (k = 0), it is trivial to see that 1/(n-k) = 1/n. For k>0, the numerator of each fraction can be cancelled with the denominator of the next, leaving the nn from the 0th draw as the only uncancelled denominator. Therefore, no matter on which draw an element is drawn, it is drawn with a 1/n chance, so each array permutation is equally likely to arise.
[3]:
import random
class Solution1(object):
"""
Time: O(N^2). The quadratic time complexity arises from the calls to list.remove (or list.pop),
which run in linear time. nn linear list removals occur,
which results in a fairly easy quadratic analysis.
Space: O(N). Because the problem also asks us to implement reset,
we must use linear additional space to store the original array.
Otherwise, it would be lost upon the first call to shuffle.
"""
def __init__(self, nums):
self.array = nums
self.original = list(nums)
def reset(self):
self.array = self.original
self.original = list(self.original)
return self.array
def shuffle(self):
aux = list(self.array)
for idx in range(len(self.array)):
remove_idx = random.randrange(len(aux))
self.array[idx] = aux.pop(remove_idx)
return self.array
[4]:
nums = [1, 2, 3]
solution = Solution1(nums)
solution.shuffle()
solution.reset()
solution.shuffle()
print(nums)
[3, 1, 2]
2. Fisher-Yates Algorithm [O(N), O(N)]¶
Intuition We can cut down the time and space complexities of shuffle with a bit of cleverness - namely, by swapping elements around within the array itself, we can avoid the linear space cost of the auxiliary array and the linear time cost of list modification.
Algorithm The Fisher-Yates algorithm is remarkably similar to the brute force solution. On each iteration of the algorithm, we generate a random integer between the current index and the last index of the array. Then, we swap the elements at the current index and the chosen index - this simulates drawing (and removing) the element from the hat, as the next range from which we select a random index will not include the most recently processed one. One small, yet important detail is that it is possible to swap an element with itself - otherwise, some array permutations would be more likely than others.
[7]:
class Solution2(object):
"""
Time: O(n). The Fisher-Yates algorithm runs in linear time, as generating a random index
and swapping two values can be done in constant time.
Space: O(n). Although we managed to avoid using linear space on the auxiliary array
from the brute force approach, we still need it for reset, so we're stuck with linear space complexity.
"""
def __init__(self, nums):
self.array = nums
self.original = list(nums)
def reset(self):
self.array = self.original
self.original = list(self.original)
return self.array
def shuffle(self):
for i in range(len(self.array)):
swap_idx = random.randrange(i, len(self.array))
self.array[i], self.array[swap_idx] = self.array[swap_idx], self.array[i]
return self.array
[8]:
nums = [1, 2, 3]
solution = Solution2(nums)
solution.shuffle()
solution.reset()
solution.shuffle()
print(nums)
[3, 2, 1]
[15]:
import random
class Solution3(object):
"""
shuffle ???
"""
def __init__(self, nums):
"""
:type nums: List[int]
:type size: int
"""
self.__nums = nums
self.original = list(nums)
def reset(self):
"""
Resets the array to its original configuration and return it.
:rtype: List[int]
"""
def reset(self):
self.__nums = self.original
self.original = list(self.original)
return self.__nums
def shuffle(self):
"""
Returns a random shuffling of the array.
:rtype: List[int]
"""
nums = list(self.__nums)
for i in range(len(nums)):
j = random.randint(i, len(nums)-1)
nums[i], nums[j] = nums[j], nums[i]
return nums
[16]:
nums = [1, 2, 3]
solution = Solution3(nums)
solution.shuffle()
solution.reset()
solution.shuffle()
print(nums)
[1, 2, 3]